OPTIMAL HOEFFDING BOUNDS FOR DISCRETE REVERSIBLE MARKOV CHAINS

By Carlos A. León and François Perron

Universidad de Concepción and Université de Montréal

We build optimal exponential bounds for the probabilities of large deviations of sums $\sum_{k=1}^n f(X_k)$, where $(X_k)$ is a finite reversible Markov chain and $f$ is an arbitrary bounded function. These bounds depend only on the stationary mean $E_\pi f$, the end-points of the support of $f$, the sample size $n$ and the second largest eigenvalue $\lambda$ of the transition matrix.

0. Introduction. Consider an ergodic Markov chain $(X_k)$ with finite state space $E$, transition matrix $P$ and stationary distribution $\pi$. Let $f : E \to \mathbb{R}$ satisfy $\min f(E) = 0$, $\max f(E) = 1$ and let $\mu = \int f\,d\pi$. From the weak law of large numbers we know that the empirical mean $n^{-1}S_n = n^{-1}\sum_{k=1}^n f(X_k)$ converges to $\mu$ in probability. This result is the working principle behind all Markov chain Monte Carlo (MCMC) integration techniques. The basis of MCMC dates back to the 1950s with the article of Metropolis, Rosenbluth, Rosenbluth, Teller and Teller (1953), but it is only with today's computing power that these methods can give their full measure. As in the classical Monte Carlo schemes, one way of getting insight about the above convergence is by looking at the first moment $E[S_n]$ and the (asymptotic) variance $\lim n^{-2} V[S_n]$. There is abundant literature covering these matters; see, for example, Peskun (1973) and Smith and Roberts (1993). A related problem is to study the rate at which the chain approaches stationarity. Instead, our concern will be the stationary large deviation probabilities

$$P_\pi[S_n \ge n(\mu + \varepsilon)]. \tag{1}$$

As $\mu$ and $f$ are arbitrary, this also covers the case $S_n \le n(\mu - \varepsilon)$, so we can restrict ourselves to upper deviations without loss of generality. It is a well-known result from large deviation theory [see Dembo and Zeitouni (1998)] that the asymptotic rate of convergence to zero in (1) is exponential with rate function $I_\nu(x) = \sup_{t\in\mathbb{R}}\{tx - \log\nu(t)\}$, where $\nu(t)$ is the largest eigenvalue of the matrix with coefficients $P(i,j)e^{tf(j)}$. On the other hand, the literature dealing with fixed sample size upper bounds for the above probability is scarce [see Gillman (1993), Dinwoodie (1995) and Lezaud (1998)] and the results do not compare well with the classical bounds when restricted to the independent case [see Hoeffding (1963)].

The above authors use perturbation theory for linear operators to estimate the Perron-Frobenius eigenvalue $\nu(t)$ and obtain upper bounds from the Markov inequality through the matrix representation of the moment generating function $E[\exp(tS_n)]$. In particular, Lezaud (1998) obtains results for nonreversible and continuous chains.

Our approach contains some elements of the latter, but manages to reduce the initial problem to a simpler one where exact calculations can be carried out. Our bounds are optimal in the sense that the exponential rate is reached asymptotically for a class of Markov chains.



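Theorem 1. For all pairs $((X_n), f)$ such that $(X_n)$ is a finite, ergodic and reversible Markov chain in stationary state with second largest eigenvalue $\lambda$ and $f$ is a function taking values in $[0,1]$ such that $E[f(X_i)] = \mu$, the following bounds, with $\lambda_0 = \max(0,\lambda)$, hold for all $\varepsilon > 0$ such that $\mu + \varepsilon < 1$ and all time $n$:

$$P_\pi[S_n \ge n(\mu+\varepsilon)] \le \left[\frac{\mu + \bar\mu\lambda_0}{1 - 2(\bar\mu-\varepsilon)/(1+\sqrt{\Delta})}\right]^{n(\mu+\varepsilon)} \left[\frac{\bar\mu + \mu\lambda_0}{1 - 2(\mu+\varepsilon)/(1+\sqrt{\Delta})}\right]^{n(\bar\mu-\varepsilon)} \tag{2}$$

$$\le \exp\left\{-2\,\frac{1-\lambda_0}{1+\lambda_0}\,n\varepsilon^2\right\}, \tag{3}$$

where

$$\Delta = 1 + \frac{4\lambda_0(\mu+\varepsilon)(\bar\mu-\varepsilon)}{\mu\bar\mu(1-\lambda_0)^2}, \qquad \bar\mu = 1-\mu.$$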



In particular, the upper bound given by expression (2) is governed by the large deviation rate function for a two-state chain, which, for $\lambda = 0$, coincides with Hoeffding's bound. The bounds are optimal for $\lambda \ge 0$.
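
As a numerical illustration of Theorem 1, the following minimal sketch evaluates the right-hand sides of (2) and (3). The function names `delta`, `bound_2` and `bound_3` are hypothetical and chosen here for illustration only; the formulas are transcribed from the statement above, under the assumptions $0 < \mu < \mu + \varepsilon < 1$ and $\lambda < 1$.

```python
import math


def delta(mu, eps, lam0):
    """Delta of Theorem 1, with mu_bar = 1 - mu."""
    mu_bar = 1.0 - mu
    return 1.0 + 4.0 * lam0 * (mu + eps) * (mu_bar - eps) / (
        mu * mu_bar * (1.0 - lam0) ** 2
    )


def bound_2(n, mu, eps, lam):
    """Right-hand side of (2); assumes 0 < mu < mu + eps < 1 and lam < 1."""
    lam0 = max(0.0, lam)
    mu_bar = 1.0 - mu
    root = math.sqrt(delta(mu, eps, lam0))
    first = (mu + mu_bar * lam0) / (1.0 - 2.0 * (mu_bar - eps) / (1.0 + root))
    second = (mu_bar + mu * lam0) / (1.0 - 2.0 * (mu + eps) / (1.0 + root))
    # Work on the log scale to limit underflow for large n.
    log_bound = n * ((mu + eps) * math.log(first) + (mu_bar - eps) * math.log(second))
    return math.exp(log_bound)


def bound_3(n, eps, lam):
    """Right-hand side of (3), the uniform (mu-free) bound."""
    lam0 = max(0.0, lam)
    return math.exp(-2.0 * (1.0 - lam0) / (1.0 + lam0) * n * eps ** 2)
```

For instance, `bound_2(100, 0.3, 0.1, 0.5)` should not exceed `bound_3(100, 0.1, 0.5)`, and with `lam = 0.0` the first function reduces to the classical Hoeffding expression for independent variables.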

The paper is organized as follows. We will first solve the two-state case, which turns out to be the extremal case needed in the sequel. Next, we handle the case where the cardinality of $E$ is finite by introducing a modification of the spectrum of the transition matrix resulting in a new chain $(X'_k)$, which will serve as a bridge between the initial chain $(X_k)$ and the two-state case through a positive semidefiniteness argument and a convex majorization result. We also compare our bound with existing bounds. Finally, countable chains with a spectral gap can be handled in the same manner.

1. Solution of the two-state case. Let $(Y_k)$ be an ergodic Markov chain with state space $\{0,1\}$, transition matrix with second largest eigenvalue $\lambda$ and stationary distribution $\boldsymbol{\mu} = (\bar\mu,\mu)'$, $\bar\mu + \mu = 1$, $0 < \mu < 1$. Let $I$ be the identity matrix and $\mathbf{1} = (1,1)'$. It is easily seen that $\lambda$ and $\mu$ completely specify the transition mechanism, which is then given by

$$M(\lambda,\mu) = \lambda I + (1-\lambda)\mathbf{1}\boldsymbol{\mu}'.$$
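
The matrix $M(\lambda,\mu) = \lambda I + (1-\lambda)\mathbf{1}\boldsymbol{\mu}'$ can be written out entry by entry. The sketch below (the helper name `two_state_matrix` is an assumption made for this illustration) builds it as a nested list, with state 0 listed first so that the stationary vector is $(\bar\mu,\mu)$.

```python
def two_state_matrix(lam, mu):
    """Return M(lam, mu) = lam * I + (1 - lam) * 1 mu' on the state space {0, 1}.

    Row i holds the transition probabilities out of state i; the stationary
    distribution is (1 - mu, mu) and the eigenvalues are 1 and lam.
    """
    mu_bar = 1.0 - mu
    return [
        [lam + (1.0 - lam) * mu_bar, (1.0 - lam) * mu],
        [(1.0 - lam) * mu_bar, lam + (1.0 - lam) * mu],
    ]
```

Since a 2 x 2 stochastic matrix has eigenvalues 1 and (trace minus 1), checking that the trace equals $1+\lambda$ confirms that $\lambda$ is the second eigenvalue.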

Following a classical recipe, we derive a bound for the upper deviation probabilities of the empirical sums $S_n = \sum_{k=1}^n Y_k$ using Markov's inequality: for all $t > 0$,

$$P_\mu[S_n \ge n(\mu+\varepsilon)] \le e^{-nt(\mu+\varepsilon)}E_\mu[\exp(tS_n)]. \tag{4}$$

The expectation on the right-hand side admits the representation [see Dinwoodie (1995)]

$$\boldsymbol{\mu}'[M(\lambda,\mu)D_t^2]^n\mathbf{1}, \tag{5}$$

where $D_t = \mathrm{diag}(1, e^{t/2})$. Let $D = \mathrm{diag}(\sqrt{\bar\mu},\sqrt{\mu})$ and denote by $\Gamma$ the orthogonal matrix with columns $\gamma_1 = (\sqrt{\bar\mu},\sqrt{\mu})'$ and $\gamma_2 = (-\sqrt{\mu},\sqrt{\bar\mu})'$. The expression (5) then admits the following symmetric form:

$$\gamma_1' D_t G_t^{n-1} D_t \gamma_1,$$

where $G_t = D_t[\gamma_1\gamma_1' + \lambda(I - \gamma_1\gamma_1')]D_t$. All the above expressions are derived from (5) using the spectral representation $M(\lambda,\mu) = D^{-1}\Gamma\,\mathrm{diag}(1,\lambda)\Gamma' D$. We will perform this derivation in complete detail for the general case in the next section. Now, the largest eigenvalue $\theta(t)$ of $G_t$ satisfies $\theta(t)^k = \sup_{\|x\|=1}\|G_t^k x\|$ and an application of the Cauchy-Schwarz inequality to the last display yields

$$\gamma_1' D_t G_t^{n-1} D_t \gamma_1 \le \|D_t\gamma_1\|\,\|G_t^{n-1}D_t\gamma_1\| \le \|D_t\gamma_1\|^2\,\theta(t)^{n-1}. \tag{6}$$

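Proposition 1. For the two-state Markov chain with transitions $M$, the stationary upper-deviation probabilities satisfy

$$P_\mu[S_n \ge n(\mu+\varepsilon)] \le \|D_t\gamma_1\|^2\,\theta(t)^{-1}\exp\{-n[t(\mu+\varepsilon) - \log\theta(t)]\}. \tag{7}$$

When $\lambda \ge 0$, it can further be shown that

$$P_\mu[S_n \ge n(\mu+\varepsilon)] \le \left[\frac{\mu + \bar\mu\lambda}{1 - 2(\bar\mu-\varepsilon)/(\sqrt{\Delta}+1)}\right]^{n(\mu+\varepsilon)} \left[\frac{\bar\mu + \mu\lambda}{1 - 2(\mu+\varepsilon)/(\sqrt{\Delta}+1)}\right]^{n(\bar\mu-\varepsilon)} \tag{8}$$

$$\le \exp\left\{-2\,\frac{1-\lambda}{1+\lambda}\,n\varepsilon^2\right\}, \tag{9}$$

where

$$\Delta = 1 + \frac{4\lambda(\mu+\varepsilon)(\bar\mu-\varepsilon)}{\mu\bar\mu(1-\lambda)^2}.$$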

Proof. Inequality (7) is obtained from (4) and (6) after rewriting the exponential part. Under the additional condition $\lambda \ge 0$, the nonexponential term in (7) is less than 1. Indeed,

$$\theta(t) \ge \left(\frac{D_t\gamma_1}{\|D_t\gamma_1\|}\right)' G_t \left(\frac{D_t\gamma_1}{\|D_t\gamma_1\|}\right) = \|D_t\gamma_1\|^2 + \lambda\,\frac{\gamma_1' D_t^2(I - \gamma_1\gamma_1')D_t^2\gamma_1}{\|D_t\gamma_1\|^2} \ge \|D_t\gamma_1\|^2,$$

since $I - \gamma_1\gamma_1'$ is positive semidefinite. Then we have

$$P_\mu[S_n \ge n(\mu+\varepsilon)] \le \exp\{-n[t(\mu+\varepsilon) - \log\theta(t)]\}.$$

Taking the infimum for $t > 0$, we obtain

$$P_\mu[S_n \ge n(\mu+\varepsilon)] \le \exp\{-nI_\theta(\mu+\varepsilon)\},$$

where $I_\theta(x) = \sup_{t\in\mathbb{R}}\{tx - \log\theta(t)\}$ [see Dembo and Zeitouni (1998), Lemma 2.2.5] is the rate function of the empirical averages $n^{-1}S_n$. An explicit computation of this rate function will prove (8). A simple calculation yields

$$\theta(t) = \tfrac12\left[\mathrm{Tr}(G_t) + \sqrt{\mathrm{Tr}^2(G_t) - 4\lambda e^t}\right], \tag{10}$$


where $\mathrm{Tr}(\cdot)$ denotes the trace. Taking $0 < x < 1$ arbitrary but fixed and looking for the zeros of the equation $\frac{d}{dt}[tx - \log\theta(t)] = 0$, we obtain

$$(x - \bar x)\sqrt{\mathrm{Tr}^2(G_t) - 4\lambda e^t} - [(\mu + \bar\mu\lambda)e^t - (\bar\mu + \mu\lambda)] = 0. \tag{11}$$

Multiplying by the conjugate, simplifying and expanding into powers of $e^t$, a quadratic polynomial emerges whose roots are possible candidates for the maximizing value $t_0$, which is found to be the following (see Appendix A for the details):

$$t_0 = \log\left\{\frac{(\bar\mu + \mu\lambda)[\sqrt{\Delta} - (\bar x - x)]}{(\mu + \bar\mu\lambda)[\sqrt{\Delta} + (\bar x - x)]}\right\},$$

where

$$\Delta = 1 + \frac{4\lambda x\bar x}{\mu\bar\mu(1-\lambda)^2} \quad\text{and}\quad \bar x = 1 - x.$$

Now we can evaluate expression (10) to determine the Perron eigenvalue $\theta(t_0)$, which we call simply $\theta$,

$$\theta = \frac{(\bar\mu + \mu\lambda)[\sqrt{\Delta} + 1]}{\sqrt{\Delta} + \bar x - x}.$$

This yields the rate function $I_\theta$, which after simplifying can be written as

$$I_\theta(x) = -x\log\left[\frac{\mu + \bar\mu\lambda}{1 - 2\bar x/(\sqrt{\Delta}+1)}\right] - \bar x\log\left[\frac{\bar\mu + \mu\lambda}{1 - 2x/(\sqrt{\Delta}+1)}\right], \tag{12}$$

and the right-hand side of (8) is just the explicit form of $\exp\{-nI_\theta(\mu+\varepsilon)\}$.
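
The closed form (12) can be cross-checked against a direct numerical maximization of $tx - \log\theta(t)$, with $\theta(t)$ taken from (10) and $\mathrm{Tr}(G_t) = (\bar\mu+\mu\lambda) + (\mu+\bar\mu\lambda)e^t$. The following is a minimal sketch, not part of the original derivation: the function names and the ternary search (which relies on the concavity of $t \mapsto tx - \log\theta(t)$ and on the maximizer lying in the assumed search interval $[-30, 30]$) are choices made here for illustration.

```python
import math


def theta(t, lam, mu):
    """Largest eigenvalue of G_t from (10) for the two-state chain M(lam, mu)."""
    mu_bar = 1.0 - mu
    trace = (mu_bar + mu * lam) + (mu + mu_bar * lam) * math.exp(t)
    return 0.5 * (trace + math.sqrt(trace ** 2 - 4.0 * lam * math.exp(t)))


def rate_closed_form(x, lam, mu):
    """Rate function (12), for 0 < x < 1, 0 < mu < 1 and 0 <= lam < 1."""
    mu_bar, x_bar = 1.0 - mu, 1.0 - x
    root = math.sqrt(1.0 + 4.0 * lam * x * x_bar / (mu * mu_bar * (1.0 - lam) ** 2))
    first = (mu + mu_bar * lam) / (1.0 - 2.0 * x_bar / (root + 1.0))
    second = (mu_bar + mu * lam) / (1.0 - 2.0 * x / (root + 1.0))
    return -x * math.log(first) - x_bar * math.log(second)


def rate_numeric(x, lam, mu, lo=-30.0, hi=30.0, iters=200):
    """Approximate sup_t {t x - log theta(t)} by ternary search on [lo, hi]."""

    def objective(t):
        return t * x - math.log(theta(t, lam, mu))

    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if objective(m1) < objective(m2):
            lo = m1  # the maximizer lies to the right of m1
        else:
            hi = m2  # the maximizer lies to the left of m2
    return objective(0.5 * (lo + hi))
```

For $\lambda = 0$ the closed form reduces to the relative entropy $x\log(x/\mu) + \bar x\log(\bar x/\bar\mu)$, which gives a second sanity check.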

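To prove the uniform bound (9), we will show that $I_\theta(x) \ge 2\frac{1-\lambda}{1+\lambda}(x-\mu)^2$.

First, we differentiate $g(x) = I_\theta(x)/(x-\mu)^2$ to obtain $g'(x) = (x-\mu)^{-3}h(x)$, where $h(x) = (x-\mu)I_\theta'(x) - 2I_\theta(x)$. Studying the first two derivatives of the numerator $h(x)$, it can be shown (see Appendix B) that

$$h(x)\begin{cases} > 0, & \text{if } |x - 1/2| > |\mu - 1/2|,\\ = 0, & \text{if } x = \mu \text{ or } x = \bar\mu,\\ < 0, & \text{if } |x - 1/2| < |\mu - 1/2|,\end{cases}$$

and $g'$ has the following behavior:

$$g'(x)\begin{cases} < 0, & \text{if } x < \bar\mu,\\ = 0, & \text{if } x = \bar\mu,\\ > 0, & \text{if } x > \bar\mu,\end{cases}$$

from which we deduce that $g$ attains a global minimum at $x = \bar\mu$. Now, a Taylor expansion for $g(\bar\mu) = I_\theta(\bar\mu)(\bar\mu-\mu)^{-2}$ in terms of $r = (\bar\mu-\mu)(1-\lambda)(1+\lambda)^{-1}$ gives

$$g(\bar\mu) = (\bar\mu-\mu)^{-1}\log[(1+r)/(1-r)] = 2\,\frac{1-\lambda}{1+\lambda}\,\frac1r\left[r + \frac13 r^3 + \frac15 r^5 + \cdots\right] \ge 2\,\frac{1-\lambda}{1+\lambda},$$

and from the definition of $g$ we then have

$$I_\theta(x) \ge g(\bar\mu)(x-\mu)^2 \ge 2\,\frac{1-\lambda}{1+\lambda}(x-\mu)^2. \qquad\Box$$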
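
The quadratic lower bound can be probed numerically on a grid. This is only a sketch under stated assumptions: the function names are hypothetical, the rate function is the closed form (12), and a grid scan is of course evidence rather than a proof.

```python
import math


def rate(x, lam, mu):
    """Closed-form rate function (12) of the two-state chain."""
    mu_bar, x_bar = 1.0 - mu, 1.0 - x
    root = math.sqrt(1.0 + 4.0 * lam * x * x_bar / (mu * mu_bar * (1.0 - lam) ** 2))
    first = (mu + mu_bar * lam) / (1.0 - 2.0 * x_bar / (root + 1.0))
    second = (mu_bar + mu * lam) / (1.0 - 2.0 * x / (root + 1.0))
    return -x * math.log(first) - x_bar * math.log(second)


def min_ratio_on_grid(lam, mu, points=999):
    """Smallest value of g(x) = I(x) / (x - mu)^2 over an interior grid of (0, 1)."""
    smallest = math.inf
    for i in range(1, points + 1):
        x = i / (points + 1)
        if abs(x - mu) < 1e-9:
            continue  # g is defined by a limit at x = mu; skip that grid point
        smallest = min(smallest, rate(x, lam, mu) / (x - mu) ** 2)
    return smallest


def g_at_mu_bar(lam, mu):
    """Value g(mu_bar) = log[(1 + r) / (1 - r)] / (mu_bar - mu), for mu != 1/2."""
    mu_bar = 1.0 - mu
    r = (mu_bar - mu) * (1.0 - lam) / (1.0 + lam)
    return math.log((1.0 + r) / (1.0 - r)) / (mu_bar - mu)
```

With `lam = 0.5` and `mu = 0.3`, the grid minimum should stay above the constant $2(1-\lambda)/(1+\lambda) = 2/3$ and sit close to `g_at_mu_bar(0.5, 0.3)`.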



2. General case. Let $(X_k)$ have transitions $P = (p_{ij})$ with stationary distribution $\pi = (\pi_i)$ such that

$$\pi_i p_{ij} = \pi_j p_{ji} \tag{13}$$

for all $i,j$ in the finite and ordered state space $(E,\le)$. (It is convenient to leave the order $\le$ unspecified.) Consider now a bounded function $f : E \to \mathbb{R}$ with $\min f(E) = 0$, $\max f(E) = 1$ and such that the stationary mean $E_\pi[f(X_k)]$ is equal to a fixed number $\mu$. Applying an obvious affine transformation we can always set $\min f(E) = 0$ and $\max f(E) = 1$; this has no bearing on our argument and will only make the expressions more concise.

Using condition (13) we now derive a spectral decomposition for the matrix $P$. This result is well known [see, e.g., Green and Han (1992)], but our derivation contains some new elements and will allow us to introduce most of the notation needed for the sequel in a smooth way. Call $D$ the diagonal matrix with (diagonal) elements $\sqrt{\pi_i}$, where $i$ runs through $(E,\le)$. Condition (13) says that $D^2P$ is symmetric, and hence $DPD^{-1}$ is symmetric too. Since the last product shares its spectrum with $P$, the transition matrix has real eigenvalues $\lambda_l$, and further admits the spectral representation

$$P = D^{-1}\Gamma\,\mathrm{diag}[\lambda_l]\,\Gamma' D, \tag{14}$$

where $\Gamma$ is orthogonal. Furthermore, since $P$ is irreducible and aperiodic, from the Perron-Frobenius theorem we know that the largest eigenvalue $\lambda_1 = 1$ strictly dominates in modulus any other eigenvalue and also the corresponding eigenvector in (14) is positive. In fact, using (14) and stationarity we get

$$D^{-1}\Gamma\,\mathrm{diag}[\delta_l^1]\,\Gamma' D = A,$$

where $A$ is the limiting matrix $(A_{ij}) = \pi_j$ and $\delta_l^1$ is Kronecker's delta. So far, all this is well known. Now, let $\lambda$ be the second largest eigenvalue of $P$, $\lambda_0 = \max(0,\lambda)$ and consider

$$Q = D^{-1}\Gamma\,\mathrm{diag}[\max(\lambda_0,\lambda_l)]\,\Gamma' D = \lambda_0 I + (1-\lambda_0)A. \tag{15}$$

Clearly the rows of $Q$ sum to 1 and the off-diagonal elements are positive, hence it is a good candidate to be a stochastic matrix, with $Q$ being a convex linear combination of stochastic matrices. It only remains to show that the diagonal elements are always nonnegative. Since $D(Q-P)D^{-1}$ is positive semidefinite, the desired result follows from

$$q_{ii} - p_{ii} = e_i' D(Q-P)D^{-1}e_i \ge 0,$$

where $e_i$ is the corresponding canonical vector. Observe that since the map $\lambda_l \mapsto \max(\lambda_0,\lambda_l)$ leaves the largest eigenvalue unchanged, $\pi$ is stationary for $Q$; alternatively, check it directly from (15). When $\lambda \ge 0$ we have $\lambda = \lambda_0$, so the largest and the second largest eigenvalues of $Q$ and those of $P$ are the same. Since $0 \le \mathrm{Tr}(P) = 1 + \sum_{l\ge 2}\lambda_l \le 1 + \lambda(|E|-1)$, where $|E|$ is the cardinality of $E$, we have that $\lambda \ge -(|E|-1)^{-1}$; hence when dealing with arbitrarily large chains, $\lambda_0$ and $\lambda$ cannot be very far apart. A crucial property of these transitions is the preservation of the Markov property under any transformation.

Here we give a simple construction for deriving a Markov chain $(X'_k)$ with transition probabilities $Q$. Consider $\lambda_0 \in [0,1)$, $\pi$ and $E$ as fixed but arbitrary and let $(I_k)$ and $(Z_k)$ be independent sequences of i.i.d. random variables with respective distributions Bernoulli$(1-\lambda_0)$ and $\pi$ on $E$. Let $I'_1 = 1$ and $I'_k = I_k$ for $k > 1$. It is easy to verify that the construction

$$X'_k = \sum_{\{j:\,1\le j\le k\}}\Big\{\prod_{\{\ell:\,j<\ell\le k\}}(1-I'_\ell)\Big\}I'_j Z_j$$

works. Moreover, if we set

$$N(j) = \sum_{\{k:\,j\le k\le n\}}\Big\{\prod_{\{\ell:\,j<\ell\le k\}}(1-I'_\ell)\Big\}I'_j,$$

then

$$\sum_{k=1}^n f(X'_k) = \sum_{j=1}^n N(j)f(Z_j) \tag{16}$$

with independence between $N(1),\dots,N(n)$ and $f(Z_1),\dots,f(Z_n)$. It is plain from this representation that applying a transformation on the observations only amounts to changing the distribution of the i.i.d. $Z$'s, that is, changing $E$ and $\pi$. Since $(X'_k)$ is Markovian regardless of $E$ and $\pi$, any transformation will preserve the Markov property.
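
The construction can be illustrated by simulation. In the sketch below (the function name, the state labels and the values of $f$ used in the usage note are assumptions made for illustration), the chain keeps its current state with probability $\lambda_0$ and otherwise redraws from $\pi$, which is exactly a transition according to $Q = \lambda_0 I + (1-\lambda_0)A$; the returned counts play the role of $N(j)$, so identity (16) can be checked on any realization.

```python
import random


def simulate_refresh_chain(n, lam0, states, weights, rng):
    """Simulate X'_1..X'_n together with the draws Z_j and the counts N(j)."""
    # I'_1 = 1 forces a draw at time 1; later indicators are Bernoulli(1 - lam0).
    refresh = [1] + [1 if rng.random() < 1.0 - lam0 else 0 for _ in range(n - 1)]
    z = rng.choices(states, weights=weights, k=n)  # i.i.d. draws Z_1..Z_n from pi
    x, counts, last = [], [0] * n, 0
    for k in range(n):
        if refresh[k]:
            last = k  # most recent index j <= k with I'_j = 1
        x.append(z[last])  # X'_k equals Z_j for that index j
        counts[last] += 1  # N(j): number of times k at which Z_j is used
    return x, z, counts
```

For example, with `f = {"a": 0.0, "b": 0.5, "c": 1.0}` and `x, z, counts = simulate_refresh_chain(200, 0.6, ["a", "b", "c"], [0.2, 0.5, 0.3], random.Random(12345))`, the sums `sum(f[s] for s in x)` and `sum(c * f[s] for c, s in zip(counts, z))` agree, as (16) states.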

Our goal now will be to relate the moment generating function $E_\pi[\exp(tS_n)]$ to its 0-1 counterpart studied previously. This will be done in two steps. First, we compare $S_n = \sum_{k=1}^n f(X_k)$ with the $n$th partial sum $S'_n$ of the chain $(f(X'_k))$, where $(X'_k)$ has transitions $Q$. Second, we establish a stochastic majorization property for $(X'_k)$ that will enable us to relate it to the two-state case.

2.1. Step 1. As seen previously for the two-state case, the moment-generating function of the partial sums $S_n$ can be written as

$$E_\pi[\exp(tS_n)] = \pi'[PD_t^2]^n\mathbf{1},$$

where $D_t$ is the diagonal matrix with entries $\exp(tf(i)/2)$. Since $\pi$ is stationary for $P$ and since diagonal matrices commute, from the spectral representation (14) we get

$$\pi'[PD_t^2]^n\mathbf{1} = \pi' D_t[D_tPD_t]^{n-1}D_t\mathbf{1} = \pi' D_tD^{-1}\big[D_t\Gamma\,\mathrm{diag}[\lambda_l]\,\Gamma' D_t\big]^{n-1}DD_t\mathbf{1} = \gamma_1' D_tG_t^{n-1}D_t\gamma_1,$$

where $\gamma_1 = (\sqrt{\pi_i})$ is the first column of $\Gamma$ and $G_t = D_t\Gamma\,\mathrm{diag}[\lambda_l]\,\Gamma' D_t$. Since $G_t$ has nonnegative entries and is irreducible, from the Perron-Frobenius theorem the largest eigenvalue $\zeta(t)$ satisfies $\zeta(t)^k = \sup_{\|x\|=1}\|G_t^kx\|$ and the same argument that we used to obtain (7) yields

$$\gamma_1' D_tG_t^{n-1}D_t\gamma_1 \le \|D_t\gamma_1\|^2\,\zeta(t)^{n-1}.$$

If we introduce $H_t = D_t\Gamma\,\mathrm{diag}[\max(\lambda_0,\lambda_l)]\,\Gamma' D_t$ and denote its largest eigenvalue by $\eta(t)$, then $H_t - G_t$ is positive semidefinite and $\eta(t)$ dominates $\zeta(t)$ so that

$$\|D_t\gamma_1\|^2\,\zeta(t)^{n-1} \le \|D_t\gamma_1\|^2\,\eta(t)^{n-1}.$$

Proposition 2. The large deviation probabilities satisfy

$$P_\pi[S_n \ge n(\mu+\varepsilon)] \le \|D_t\gamma_1\|^2\,\zeta(t)^{-1}\exp\{-n[t(\mu+\varepsilon) - \log\zeta(t)]\} \tag{17}$$

$$\le \|D_t\gamma_1\|^2\,\eta(t)^{-1}\exp\{-n[t(\mu+\varepsilon) - \log\eta(t)]\}. \tag{18}$$

Since $\lambda_0 \ge 0$, we further have

$$P_\pi[S_n \ge n(\mu+\varepsilon)] \le \exp\{-n[t(\mu+\varepsilon) - \log\eta(t)]\}.$$

Proof. Markov's inequality together with (17) and (18) implies the first two inequalities. Now, just as in the 0-1 case, the condition $\lambda_0 \ge 0$ guarantees $\eta(t) \ge \|D_t\gamma_1\|^2$ and the last inequality holds. □

2.2. Step 2. 

Theorem 2. Let $(X'_k)$ have transitions $Q$ and let $E_\pi[f(X'_k)] = \mu$. Then for any convex function $\Psi : \mathbb{R} \to \mathbb{R}$, we have

$$E[\Psi(f(X'_1) + \cdots + f(X'_n))] \le E[\Psi(Y_1 + \cdots + Y_n)],$$

where $(Y_k)$ is the two-state chain with transitions $M(\lambda_0,\mu)$.

Proof. The proof is based on the representation (16) and a construction. We introduce random variables $B_j$, $j = 1,\dots,n$, and we consider a joint distribution on $(B_j,Z_j)$ such that the conditional distribution of $B_j$, given $Z_j$, is a Bernoulli distribution with mean $f(Z_j)$. We assume that $(B_1,Z_1),\dots,(B_n,Z_n)$ are independent. Therefore, the marginal distribution of $B_j$ is Bernoulli$(\mu)$ for $j = 1,\dots,n$.

As in expression (16) we set

$$Y_k = \sum_{\{j:\,1\le j\le k\}}\Big\{\prod_{\{\ell:\,j<\ell\le k\}}(1-I'_\ell)\Big\}I'_jB_j,$$

so

$$\sum_{k=1}^n Y_k = \sum_{j=1}^n N(j)B_j.$$

Jensen's inequality combined with a conditional expectation tells us that

$$\Psi\Big(\sum_{j=1}^n N(j)f(Z_j)\Big) = \Psi\Big(E\Big[\sum_{j=1}^n N(j)B_j \,\Big|\, N,Z\Big]\Big) \le E\Big[\Psi\Big(\sum_{j=1}^n N(j)B_j\Big) \,\Big|\, N,Z\Big].$$

Taking expectation on both sides, we obtain that

$$E\Big[\Psi\Big(\sum_{k=1}^n f(X'_k)\Big)\Big] \le E\Big[\Psi\Big(\sum_{k=1}^n Y_k\Big)\Big]. \qquad\Box$$
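
Theorem 2 with $\Psi(x) = \exp(tx)$ can be checked exactly on a small example, because for $Q = \lambda_0 I + (1-\lambda_0)\mathbf{1}\pi'$ the moment generating function $\pi'[QD_t^2]^n\mathbf{1}$ is computable by repeated matrix-vector products. The sketch below is illustrative only; the function name and the three-state example in the usage note are assumptions, not taken from the paper.

```python
import math


def mgf_refresh_chain(n, lam0, pi, f, t):
    """Return pi' [Q D_t^2]^n 1 for Q = lam0 * I + (1 - lam0) * 1 pi'.

    This equals E_pi[exp(t * (f(X'_1) + ... + f(X'_n)))] in stationarity.
    """
    weights = [math.exp(t * value) for value in f]  # diagonal of D_t^2
    v = [1.0] * len(pi)
    for _ in range(n):
        u = [w * vi for w, vi in zip(weights, v)]        # u = D_t^2 v
        average = sum(p * ui for p, ui in zip(pi, u))    # pi' u
        v = [lam0 * ui + (1.0 - lam0) * average for ui in u]  # v = Q u
    return sum(p * vi for p, vi in zip(pi, v))
```

Taking `pi = [0.2, 0.5, 0.3]` and `f = [0.0, 0.4, 1.0]` gives stationary mean 0.5, so the extremal two-state chain has `pi = [0.5, 0.5]` and `f = [0.0, 1.0]`; for any real `t`, the first moment generating function should not exceed the second.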



Remark 1. In the above proof, it is implicit that the endpoints of $f(E)$ are $a = 0$ and $b = 1$. When $a < b$ are arbitrary, the corresponding extremal chain lives in $\{a,b\}$ and the transitions are determined by

$$m_{ab} = (1-\lambda_0)\,\frac{\mu-a}{b-a}, \qquad m_{ba} = (1-\lambda_0)\,\frac{b-\mu}{b-a}.$$

Theorem 2 deals with stochastic ordering of random variables. The particular stochastic order used here is known in the literature as the convex ordering: $X \preceq Y$ if $E[\Psi(X)] \le E[\Psi(Y)]$ for all convex real valued $\Psi$ (such that the expectations exist). The result can be stated as $X'_1 + \cdots + X'_n \preceq Y_1 + \cdots + Y_n$.

Observe that under stationarity $X'_k \preceq Y_k$, so we have transition schemes $Q$ and $M$ under which the stochastic order relation is preserved for the respective partial sums. When there is independence within each sequence, it is known that the convex ordering of the marginals implies the same ordering for the corresponding partial sums [see Marshall and Olkin (1979)]; our result shows that this preservation property can occur in the Markovian setting as well.

We now have all the necessary tools to prove our main result. 



Proof of Theorem 1. Applying Theorem 2 with $\Psi(x) = \exp(tx)$, we get

$$\log\eta(t) = \lim_{n} n^{-1}\log E_\pi\{\exp(tS'_n)\} \le \lim_{n} n^{-1}\log E_\mu\Big\{\exp\Big(t\sum_{k=1}^n Y_k\Big)\Big\} = \log\theta(t)$$

and then from Proposition 2 we obtain

$$P_\pi[S_n \ge n(\mu+\varepsilon)] \le \inf_{t>0}\exp\{-n[t(\mu+\varepsilon) - \log\eta(t)]\} \le \inf_{t>0}\exp\{-n[t(\mu+\varepsilon) - \log\theta(t)]\} = \exp\{-nI_\theta(\mu+\varepsilon)\},$$

where $I_\theta$ is the rate function (12). The stated upper bounds then follow from Proposition 1. □

Remark 2. With a little more effort we can see that we do not need to assume that $P$ is aperiodic in Theorem 1. Indeed, given a periodic but irreducible and reversible $P$, it is possible to construct a sequence of aperiodic, irreducible and reversible chains $P_m$, such that $P_m$ converges to $P$ as $m$ tends to infinity. Since $\lambda$, $\pi$ and $\mu$ are continuous functions of $P$, Theorem 1 will hold for all $P_m$ and the result will hold for $P$ as well, by continuity. In fact, an eigenvalue near or even equal to $-1$ is not a problem as only Cesàro sums $S_n$ are considered here.

Remark 3. Following Remark 1, the theorem remains valid when the end points $a < b$ of $f(E)$ are arbitrary. In this case the values $\mu$ and $\varepsilon$ in the bounds are to be replaced by $\frac{\mu-a}{b-a}$ and $\frac{\varepsilon}{b-a}$, respectively.
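
The rescaling of Remark 3 is easy to apply to the uniform bound (3), which does not involve $\mu$. A minimal sketch, with a hypothetical function name and argument order chosen for illustration:

```python
import math


def uniform_bound(n, eps, lam, a=0.0, b=1.0):
    """Bound (3) for f taking values in [a, b], with eps rescaled to eps / (b - a)."""
    lam0 = max(0.0, lam)
    scaled_eps = eps / (b - a)  # Remark 3: deviations are measured in units of b - a
    return math.exp(-2.0 * (1.0 - lam0) / (1.0 + lam0) * n * scaled_eps ** 2)
```

A deviation of 0.2 for a function with range $[-1,1]$ is thus treated like a deviation of 0.1 for a function with range $[0,1]$.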

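It is clear from the proof of Theorem 1 that, under the condition $\lambda \ge 0$, the rate functions $I_\zeta(x) = \sup_{t\in\mathbb{R}}\{tx - \log\zeta(t)\}$, $I_\eta(x) = \sup_{t\in\mathbb{R}}\{tx - \log\eta(t)\}$ and $I_\theta(x) = \sup_{t\in\mathbb{R}}\{tx - \log\theta(t)\}$ corresponding to $S_n$, $S'_n$ and $\sum_{k=1}^n Y_k$, respectively, satisfy

$$I_\zeta(x) \ge I_\eta(x) \ge I_\theta(x). \tag{19}$$

When $P = Q$, there is equality on the leftmost side. Furthermore, when $f$ is 0-1, we have equality in the rightmost side. Hence, when $P = Q$ and $f : E \to \{0,1\}$, the exponential rate given in our first upper bound cannot be improved upon. In particular, when the chain is independent, the theorem yields the well-known Hoeffding's inequality

$$P[S_n \ge n(\mu+\varepsilon)] \le \left[\left(\frac{\mu}{\mu+\varepsilon}\right)^{\mu+\varepsilon}\left(\frac{\bar\mu}{\bar\mu-\varepsilon}\right)^{\bar\mu-\varepsilon}\right]^n \le \exp(-2n\varepsilon^2).$$
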
Remark 4. A closer look reveals that the leftmost inequality in (19) is true for all $\lambda$. This suggests the possibility that the theorem might be true for all admissible values of $\lambda$. But this is not so: numerical evidence shows that the bounds do not hold without the condition $\lambda \ge 0$.



3. Comparisons. Gillman (1993) was the first to obtain a finite sample size exponential bound for the large deviation probabilities using perturbation theory. Successive refinements of the technique allowed Dinwoodie (1995) and Lezaud (1998) to improve this bound. Among these, the latter work contains the best results and we shall use them for the comparisons. With $f$ satisfying our usual assumptions, Theorem 1.1 of Lezaud (1998) gives in our particular case

(20) ,„[*. > „(„ + £) ] < e<->/s exp {_ _|l^_ }, 

where h(x) = yT — x — (1 — x). Let us denote L(fj,,e) the exponential rate in 
(20), and let $I_\theta(\mu+\varepsilon)$ be the exponential rate in the bound (2). Observe first that since $I_\theta(\mu+\varepsilon)$ comes from the rate function of the two-state case, and since $[\mu+\varepsilon,\infty)$ is a continuity set of $I_\theta$, sampling from this chain implies

$$\lim_{n\to\infty} n^{-1}\log P_\mu[S_n \ge n(\mu+\varepsilon)] = -I_\theta(\mu+\varepsilon) \le -L(\mu,\varepsilon),$$

hence, when $\lambda \ge 0$, the rate $I_\theta(\mu+\varepsilon)$ always yields a better bound. A limited
Taylor expansion of Ig around /i gives an idea of the ratio of these quantities 

h^ + e) 2 
L(jjl,e) /2(1 + A) + 1 >■ 

APPENDIX A 

The leading, middle and constant terms of the convex quadratic polynomial obtained from (11) are

$$a = [1-(2x-1)^2](\mu+\bar\mu\lambda)^2,$$

$$b = -2\{[\mu\bar\mu(1-\lambda)^2+\lambda][1+(2x-1)^2] - 2\lambda(2x-1)^2\}$$

and

$$c = (\bar\mu+\mu\lambda)^2[1-(2x-1)^2],$$

respectively. After some simplifications the discriminant $b^2 - 4ac$ can be written as

$$16(2x-1)^2[\mu\bar\mu(1-\lambda)^2]^2\left[1 + \frac{4\lambda x(1-x)}{\mu\bar\mu(1-\lambda)^2}\right] \ge 0$$

and the roots $\xi_{0,1}$ are given, respectively, by

$$\frac{\mu\bar\mu(1-\lambda)^2[1+(2x-1)^2] + \lambda[1-(2x-1)^2]}{[1-(2x-1)^2](\mu+\bar\mu\lambda)^2} \pm \frac{2(2x-1)\mu\bar\mu(1-\lambda)^2\sqrt{\Delta}}{[1-(2x-1)^2](\mu+\bar\mu\lambda)^2}. \tag{21}$$

Now, consider the conjugate product

$$[\sqrt{\Delta}+(1-2x)][\sqrt{\Delta}-(1-2x)] = \frac{(\mu+\bar\mu\lambda)(\bar\mu+\mu\lambda)[1-(2x-1)^2]}{\mu\bar\mu(1-\lambda)^2}.$$

Since it is positive for $0 < x < 1$ and since both terms on the left-hand side are positive at $x = 1/2$ and continuous, each is positive for all $0 < x < 1$. Multiplying the numerator and denominator in the expression (21) by $\sqrt{\Delta}+(1-2x)$, for $\xi_0$, and by $\sqrt{\Delta}-(1-2x)$, for $\xi_1$, the roots can be written as

$$\xi_0 = \frac{(\bar\mu+\mu\lambda)[\sqrt{\Delta}-(1-2x)]}{(\mu+\bar\mu\lambda)[\sqrt{\Delta}+(1-2x)]}, \qquad \xi_1 = \frac{(\bar\mu+\mu\lambda)[\sqrt{\Delta}+(1-2x)]}{(\mu+\bar\mu\lambda)[\sqrt{\Delta}-(1-2x)]}.$$

Except for $x = 1/2$, where they coincide, exactly one of these is the solution of (11), the other being the solution to the conjugate equation. To arbitrate, let us evaluate the rightmost term in (11) for the first candidate. We obtain

$$(\mu+\bar\mu\lambda)\,\frac{(\bar\mu+\mu\lambda)[\sqrt{\Delta}-(1-2x)]}{(\mu+\bar\mu\lambda)[\sqrt{\Delta}+(1-2x)]} - (\bar\mu+\mu\lambda) = \frac{2(2x-1)(\bar\mu+\mu\lambda)}{\sqrt{\Delta}+(1-2x)}.$$

Since this expression shares its sign with the leftmost term in (11), we have found the maximizing value.

APPENDIX B 

The behavior of $h(x) = (x-\mu)I_\theta'(x) - 2I_\theta(x)$ depends on whether $\mu < 1/2$ or $\mu > 1/2$; we shall carry out the analysis for the first case, the other being similar but somewhat less involved. To begin with, the first derivatives of $I_\theta$ are found to be

$$I_\theta'(x) = -\log\left[\frac{\mu+\bar\mu\lambda}{1-2\bar x/(1+\sqrt{\Delta})}\right] + \log\left[\frac{\bar\mu+\mu\lambda}{1-2x/(1+\sqrt{\Delta})}\right],$$

$$I_\theta''(x) = (\sqrt{\Delta}\,x\bar x)^{-1},$$

$$I_\theta^{(3)}(x) = \frac{(x-\bar x)(3\Delta-1)}{2\Delta^{3/2}(x\bar x)^2},$$

so that $I_\theta(\mu) = I_\theta'(\mu) = 0$, while $I_\theta^{(3)}(x) \propto (x-\bar x)$, since the other terms are positive. Now, we have $h'(x) = (x-\mu)I_\theta''(x) - I_\theta'(x)$, $h''(x) = (x-\mu)I_\theta^{(3)}(x)$, and Figure 1 summarizes the analysis of their sign.

Fig. 1.

Combining this with the fact that $h(\mu) = h(\bar\mu) = 0$, we see that these are the only zeros and further, $h$ is negative in $(\mu,\bar\mu)$ and positive in $(0,\mu)\cup(\bar\mu,1)$.